Text File | 1993-08-25 | 18KB | 393 lines

------------------------------------------------------------
                          TRAINER
      (From Passive Pupil to Self-Correcting Scholar)
------------------------------------------------------------

                           INDEX

   TRAINER . . . . . . . . . . . . . . . . . . . . . 1
   Sample Answer Card  . . . . . . . . . . . . . . . 2
   Sample Test Instructions and Selected Questions . 3
   Short Form Item Analysis  . . . . . . . . . . . . 4
   Converting Raw Scores to Grades . . . . . . . . . 5
   TRAINER Programs and Files  . . . . . . . . . . . 6
   Bibliography and Support  . . . . . . . . . . . . 7

------------------------------------------------------------

------------------------------------------------------------
                          TRAINER                        1/7
      (From Passive Pupil to Self-Correcting Scholar)
------------------------------------------------------------

TRAINER was started in 1981 to score tests by quality and
quantity for underprepared college students.  The test grades
needed to reward the development of:

   1. Good study habits (as required by an essay test).

   2. The sense of responsibility needed to learn at higher
      levels of thinking.

   3. The self-judgment required for a self-correcting scholar.

The qualitative score (percent right) is the feel-good score
or the you-are-on-the-right-track score.  It indicates to what
extent students know their own minds, their self-judgment.
The combined qualitative and quantitative score is the test
grade.

The minimum answer sheet had to have three options:

   A. GUESS: The traditional guess test style (random guess,
      answer every question), which encourages the use of
      lower levels of thinking.

   B. KNOW: Mark only when confident the answer is an
      acceptable report of what is known or reasoned, which
      encourages the use of higher levels of thinking.

   C. Both A and B: A concrete demonstration that each student
      can voluntarily perform.  A comparison between the
      familiar, low performance pupil and the unfamiliar,
      high performance student.

(When given only the KNOW option, students complained, "Why
can't we guess here as we do on tests all over campus?")

This concrete comparison was absolutely essential for students
to evaluate the two test styles.  By the third hour test each
semester over 90% selected only the KNOW style:

   I can spend my time on the questions I know something about.

   It is honest, I am not forced to guess (to lie).

   I try to master what I am studying rather than memorize
   everything.

   I get better grades now with less study time.

----------------------------------------
Sample Answer Card                   2/7
----------------------------------------

NAME _______________________________
COURSE _____________________________

I [ ] 0 1 2 3 4 5 6 7 8 9
D [ ] 0 1 2 3 4 5 6 7 8 9
  [ ] 0 1 2 3 4 5 6 7 8 9
N [ ] 0 1 2 3 4 5 6 7 8 9
U [ ] 0 1 2 3 4 5 6 7 8 9
M [ ] 0 1 2 3 4 5 6 7 8 9
B [ ] 0 1 2 3 4 5 6 7 8 9
E [ ] 0 1 2 3 4 5 6 7 8 9
R [ ] 0 1 2 3 4 5 6 7 8 9

ANSWERS
  1 [A][B][C][D][E]    51 [A][B][C][D][E]
  2  A  B  C  D  E     52  A  B  C  D  E
  3  A  B  C  D  E     53  A  B  C  D  E
  4  A  B  C  D  E     54  A  B  C  D  E
  5  A  B  C  D  E     55  A  B  C  D  E
\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/
\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/
 46  A  B  C  D  E     96  A  B  C  D  E
 47  A  B  C  D  E     97  A  B  C  D  E
 48  A  B  C  D  E     98  A  B  C  D  E
 49  A  B  C  D  E     99  A  B  C  D  E
 50  A  B  C  D  E    100  A  B  C  D  E
----------------------------------------
Based on card F-4000-9 printed by the
Clearview Printing Co. Inc.

----------------------------------------------------------------------
Sample Test Instructions and Selected Questions *                  3/7
----------------------------------------------------------------------

GENERAL BIOLOGY 102, SECTION 1, TEST #1            12-Jan-89  09:43 AM

VALID ANSWER CARD: Name, Student Number and Seat Number

You have the option of (A) answering all questions (GUESSing if you
do not know), (B) reporting (marking) only what you KNOW or can
reason, or (C) marking both methods.  Use the left side 1-49 for (A).
Use the right side 51-99 for (B).

(A) SCORING:  0% for self-judgment, +2 for right, 0 for wrong
              (GUESS if you do not know.)
(B) SCORING: 50% for self-judgment, +1 for right, -1 for wrong
              (Report only if you are confident of being right.)
(C) SCORING: Mark (A) on the left AND (B) on the right side.

MARK ANSWER 50:  A) GUESSing
                 B) Reporting what I KNOW or can reason
                 C) Average of A and B

 4. Ponds do not freeze from the bottom up because:
    A) ice has a greater volume than the equivalent amount of water
    B) the specific heat of water is low
    C) ice is more dense than liquid water
    D) of the high surface tension of water

 6. How is the arrangement of water molecules in ice different from
    their arrangement in liquid water?  The arrangement in ice is:
    A) more random and more open than that in water
    B) more regular and more open than that in water
    C) more regular and more compact than that in water
    D) more random and more compact than that in water

49. Next test in ( ) weeks.
    A) 1    B) 2    C) 3    D) Instructor's choice

----------------------------------------------------------------------

Classes tend to select two weeks.  Pupils at the concrete level of
thinking do not like weekly tests because they require "too much
studying", nor tests every three weeks because they require "too
much to study".

Informational questions can be inserted at any point in the test.
They are only tabulated.  They have no effect on scores.

Normally a 50 minute "hour" test is limited to 49 or fewer graded
questions.  There must be time to think if higher levels of thinking
are to apply.  If more questions are needed, the program will accept
a card marked with up to 99 answers, with position 100 marked A for
GUESS or B for KNOW.

* Test Bank by A. C. Monroe, D. J. Fox, and J. J. Cockerill, 1985,
  Worth Publishers, Inc.

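The (A) and (B) scoring rules in the sample instructions can be
sketched in a few lines.  This is only an illustration, not the
TRAINER program itself.  The names score_card and Score are
hypothetical.  The sketch assumes that a blank means an omitted
question, that the maximum is 2 points per question (so the 50% for
self-judgment on a KNOW card equals 1 point per question), and that
the quality score is percent right among the questions marked.

```python
from dataclasses import dataclass


@dataclass
class Score:
    right: int
    wrong: int
    omitted: int
    quality: float      # percent right among marked answers (assumed definition)
    test_score: float   # percent of the assumed maximum of 2 points per question


def score_card(key: str, answers: str, style: str) -> Score:
    """Score one card. A blank in `answers` means the question was omitted."""
    if len(key) != len(answers):
        raise ValueError("key and answers must have the same length")
    right = sum(1 for k, a in zip(key, answers) if a != " " and a == k)
    wrong = sum(1 for k, a in zip(key, answers) if a != " " and a != k)
    omitted = len(key) - right - wrong
    n = len(key)
    if style == "GUESS":
        # (A): 0% for self-judgment, +2 for right, 0 for wrong
        points = 2 * right
    elif style == "KNOW":
        # (B): 50% for self-judgment (n of 2n points), +1 right, -1 wrong
        points = n + right - wrong
    else:
        raise ValueError("style must be 'GUESS' or 'KNOW'")
    marked = right + wrong
    quality = 100.0 * right / marked if marked else 0.0
    test_score = 100.0 * points / (2 * n) if n else 0.0
    return Score(right, wrong, omitted, quality, test_score)
```

Under these assumptions a KNOW card with no marks at all scores 50%,
and each wrong mark costs as much as each right mark gains, which is
why marking only what is known pays off.
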
----------------------------------------------------------------------
Short Form Item Analysis                                           4/7
----------------------------------------------------------------------

                1...5...10 ....5...20 ....5...30 ....5...40 ....5...50
ANSWER KEY:     CBBAABABCC CBCBCDCAAB ACEEBCAEBA ADBDCBADDE ADBDDCDD
ITEM WEIGHTS:   1111111111 1111111111 1111111111 1111111111 111111110
DIFFICULTY2:    5 4C C 35 5C664 5 a b 0
DISCRIMINATION: T T T 1 1551 111 1 1 T111 11TT1 1
----------------------------------------------------------------------

DIFFICULTY2  (Questions that failed to perform well.)

   A question is listed here when it received a total performance
   value of less than 75% (the sum of the difficulty, K x E, value
   for those who answered and the difficulty, Item Score, value for
   the entire class).

   Upper Case = the most popular wrong answer with more than half of
   the class marking.  These items need to be reviewed in class.  In
   the above example, questions 4 and 6 usually fail to perform well
   because pupils associate DENSE and COMPACT with HARD even though
   they know ice floats on water.

   Lower case = most popular wrong answer with less than half of the
   class marking.

   Number X 10 = percent of the class marking when the right answer
   is the most popular.  No one answered item 48.

There were 15 items of questionable validity among the 48 scorable
items.  The TRGRADES program can assign grades on the basis of the
33 (48 - 15) questions that make up the true test out of the 48 items
presented to the class, unless the instructor has some justification
for doing otherwise.

No test had a set of questions that all performed well during an
eight year period.  Lacking a functional item analysis for each test,
most faculty members will not admit that a number of questions on
each test are not valid for each class.

Testing services use expert inference based on multiple tests to
determine validity.  This design fails to address the instructional
validity of items on any one test in any one class.

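The three DIFFICULTY2 codes described in this section can be sketched
as below for one item that has already been flagged (the 75%
performance test is not reproduced here).  The wording "of the class
marking" is ambiguous; this sketch reads it as the share of the class
that marked the item at all.  The function name difficulty_code, the
cap of the digit at 9, and the handling of ties are assumptions made
for illustration, not the behavior of the TRAINER program.

```python
from collections import Counter


def difficulty_code(key: str, marks: list[str]) -> str:
    """Return one DIFFICULTY2 character for a flagged item.

    `marks` holds one entry per student; " " means the item was left blank.
    """
    class_size = len(marks)
    counts = Counter(m for m in marks if m != " ")
    marked = sum(counts.values())
    if marked == 0:
        return "0"                       # no one answered the item
    popular = counts.most_common(1)[0][0]  # ties: first answer seen wins
    if popular == key:
        # Number x 10 = percent of the class marking (assumed: marking the item)
        return str(min(9, (marked * 10) // class_size))
    if marked * 2 > class_size:
        return popular.upper()           # more than half of the class marking
    return popular.lower()               # half of the class or less marking
```

For example, with key "A" and marks C, C, C, A and one blank, the
sketch returns "C": the most popular answer is wrong and four of the
five students marked the item.
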
To be instructionally useful, an analysis must respond to each class
and each test.  The analysis is further improved when based on
student reports of what they know or can do rather than on the random
guessing encouraged by the traditional use of multiple-choice
questions.

DISCRIMINATION  (Questions that differentiate between those students
                who did well on the test and those who did not.)

   Discrimination is expressed in probabilities: results that could
   have happened by chance alone at or less than 1%, 5% and 10% of
   the time (shown as 1, 5 and T for best, better, and good).  These
   questions generate most of the score distribution.

----------------------------------------------------------------------
Converting Raw Scores to Grades                                    5/7
----------------------------------------------------------------------

A score distribution can be changed in two ways:

   1. Shift     Sliding the score distribution to the right or left
                on a grade scale.

   2. Stretch   Expanding or contracting the score distribution on a
                grade scale.

TRGRADES permits an instructor to repeatedly modify the score
distribution until the desired grade distribution is obtained.  The
validity of the modifications is related to the way they are done.

   1. Norm-        The weight of all "bad" items can be used
      Referenced   (1/3 shift and 2/3 stretch for KNOW tests).  This
      (Automatic)  tends to produce a grade distribution similar to
                   the normal curve, with the exception of a few
                   students receiving scores of over 100%.  These
                   are the students who answered correctly more
                   items than the class, as a whole, determined were
                   valid.  This system rewards outstanding
                   performance at no penalty to others in the class
                   (almost the reverse of curving guess-test scores).

   2. Criterion-   "Bad" items are inspected and then either
      Referenced   accepted as bad or considered items that the
                   class is held responsible for within the
                   instructional system (lecture, reading
                   assignment, laboratory, homework, projects,
                   etc.).  These items need to be discussed with the
                   class.

   3. Inspection   Almost any desired grade distribution can be
                   obtained by a combination of shift and stretch.

Option 2 is the most valid use of the program.  Using the short form
item analysis and a copy of the test, a determination can be made in
about 10 minutes on a 48 question test.

The program also removes two common faculty worries:

   1. The test will be too hard or too easy.  There is no need to
      attempt the feat of selecting questions with the goal of
      obtaining a raw score distribution that will also be the grade
      distribution.

   2. Bad questions that will require adjusting scores.  This system
      actually needs a few such items just to keep students using
      higher levels of thinking.  The idea that all questions are
      equally valid for each student and for each class to answer is
      an academic delusion.  Bad items will always be there, in
      part, due to the mismatch of student, teacher, and evaluator
      operating at different levels of thinking.  There is no need
      to intentionally create them.

----------------------------------------------------------------------
TRAINER Programs and Files                                         6/7
----------------------------------------------------------------------

PROGRAMS:

   TR         main menu
   TRAINER    scores for quality and quantity
   TRGRADES   converts raw scores to grades
   TRREF      referees independent marking
   TRERROR    corrects card reader errors
   BRT71EFR   PDS BASIC run-time module 7.1

FILES:

   TRAINER    Input:   Answer Data File, positions 1-9 = I.D. field
                                         positions 11-100 = answers
                       (See TRSAMPLE.DOC for an example)

              Output:  PRINT   .FIL:  Individual score slips
                                      Ranked scores
                                      Histogram
                                      Class plot (quality & quantity)
                                      Item analysis
                       GUESSCOR.FIL:  GUESS test scores
                       KNOWSCOR.FIL:  KNOW test scores

   TRGRADES   Input:   GUESSCOR.FIL and/or KNOWSCOR.FIL
              Output:  GRADES  .FIL

   TRREF      Input:   Answer Data File
              Output:  BARGUESS.FIL   Answer bar graphs that
                       BARKNOW .FIL   supplement the item analysis.
                       SIMGUESS.FIL   Similarity check.
                       SIMKNOW .FIL
                       CONGUESS.FIL   Uniqueness check and collation
                       CONKNOW .FIL   of similarity and uniqueness to
                                      confirm independent marking.

INFORMATION FILES:

   TRSAMPLE.DOC   Set of 37 answer cards.
   TRINST  .DOC   This instruction file.
   WARRANTY.DOC   Warranty and distribution.
   REGISTER.DOC   Registration of use.

File names are designed to allow one DELETE *.FIL command to remove
all temporary files from the directory.  All files to be saved must
be renamed with a different extension than .FIL.

Also see README.DOC and UPDATE.DOC.

------------------------------------------------------------
BIBLIOGRAPHY and SUPPORT                                 7/7
------------------------------------------------------------

Hart, Richard A.  1981.  Evaluating and rewarding student
   initiative and judgement or an alternative to "sitting
   through" a course if you did not test out.  Pages 75-76 in
   Directory of Teaching Innovations in Biology.  Meeth, L. R.
   and Dean S. Gregory, Ed.  Studies in Higher Education:
   Arlington, Virginia.  252 pages.

Hart, Richard and Kenneth Minter.  1985.  Using a computer to
   manage typical classroom problems.  National Science
   Teachers Association Annual Meeting, Cincinnati, Ohio
   18-21 April.

Minter, Kenneth and Richard Hart.  1986.  Essay testing using
   multiple choice questions.  Missouri Academy of Science
   Annual Meeting, Warrensburg, MO 25-26 April.

Hart, Richard and Kenneth Minter.  1988.  Diagnostic Testing
   Using Multi-Choice and Matching Questions.  National Science
   Teachers Association Annual Meeting, St. Louis, MO
   7-10 April.

Minter, Kenneth and Richard Hart.  1989.  Student Choice in
   Computer Graded Tests.  National Science Teachers
   Association Annual Meeting, Seattle, Washington 6-9 April.

Hart, Richard and Kenneth Minter.  1991.  Student Choice in
   Multiple-Choice Testing.  National Science Teachers
   Association Annual Meeting, Houston, Texas 27-30 March.

------------------------------------------------------------

Program support is available to registered users from
Nine-Patch Software, 315 South Alco Ave., Maryville, MO
64468-2033.  (Others should include a stamped, self-addressed
envelope.)

   Phone 816-582-8589          CIS 71222,3565

Assistance in adapting higher levels of thinking to existing
instructional programs is available.  Of interest are student
and teacher workshops and demonstrations in which the
participants experience the concepts as well as learn about
them.

   Richard A. Hart, Ph.D.
   315 South Alco Avenue, Maryville, MO 64468-2033
------------------------------------------------------------